manager agent
Benefits and Limitations of Communication in Multi-Agent Reasoning
Rizvi-Martel, Michael, Bhattamishra, Satwik, Rathi, Neil, Rabusseau, Guillaume, Hahn, Michael
Chain-of-thought prompting has popularized step-by-step reasoning in large language models, yet model performance still degrades as problem complexity and context length grow. By decomposing difficult tasks with long contexts into shorter, manageable ones, recent multi-agent paradigms offer a promising near-term solution to this problem. However, the fundamental capacities of such systems are poorly understood. In this work, we propose a theoretical framework to analyze the expressivity of multi-agent systems. We apply our framework to three algorithmic families: state tracking, recall, and $k$-hop reasoning. We derive bounds on (i) the number of agents required to solve the task exactly, (ii) the quantity and structure of inter-agent communication, and (iii) the achievable speedups as problem size and context scale. Our results identify regimes where communication is provably beneficial, delineate tradeoffs between agent count and bandwidth, and expose intrinsic limitations when either resource is constrained. We complement our theoretical analysis with a set of experiments on pretrained LLMs using controlled synthetic benchmarks. Empirical outcomes confirm the tradeoffs between key quantities predicted by our theory. Collectively, our analysis offers principled guidance for designing scalable multi-agent reasoning systems.
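As a concrete illustration of the kind of decomposition this framework studies, the Python sketch below splits a $k$-hop pointer-chasing query across several agents, each holding one chunk of a long key-value context and exchanging only a constant-size message per hop. The chunking scheme and message format are illustrative assumptions, not the paper's construction.

```python
# Illustrative sketch (not the paper's construction): k-hop pointer chasing
# split across agents. Each agent holds one chunk of a long key -> value
# context; per hop, only the current key is broadcast, so communication
# scales with k and the number of agents rather than with context length.

def make_agents(context: dict, num_agents: int):
    """Partition the context into (up to) num_agents disjoint chunks."""
    items = list(context.items())
    size = -(-len(items) // num_agents)  # ceiling division
    return [dict(items[i:i + size]) for i in range(0, len(items), size)]

def k_hop(agents, start_key: str, k: int):
    """Resolve k hops; each hop costs one round of constant-size messages."""
    key = start_key
    for _ in range(k):
        for chunk in agents:  # broadcast the key; the holding agent answers
            if key in chunk:
                key = chunk[key]
                break
        else:
            raise KeyError(key)
    return key

context = {"a": "b", "b": "c", "c": "d", "d": "e"}
print(k_hop(make_agents(context, num_agents=2), "a", k=3))  # -> "d"
```

Even this toy version exhibits the central tradeoff the paper analyzes: more agents mean shorter per-agent contexts but more messages per hop.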
Traj-CoA: Patient Trajectory Modeling via Chain-of-Agents for Lung Cancer Risk Prediction
Zeng, Sihang, Fu, Yujuan, Zhou, Sitong, Yu, Zixuan, Liu, Lucas Jing, Wen, Jun, Thompson, Matthew, Etzioni, Ruth, Yetisgen, Meliha
Large language models (LLMs) offer a generalizable approach for modeling patient trajectories, but their temporal reasoning suffers from the long and noisy nature of electronic health record (EHR) data. To address these challenges, we introduce Traj-CoA, a multi-agent system built on a chain of agents for patient trajectory modeling. Traj-CoA employs a chain of worker agents to process EHR data sequentially in manageable chunks, distilling critical events into a shared long-term memory module, EHRMem, to reduce noise and preserve a comprehensive timeline. A final manager agent synthesizes the worker agents' summaries and the timeline extracted in EHRMem to make predictions. In a zero-shot one-year lung cancer risk prediction task based on five years of EHR data, Traj-CoA outperforms baselines from four categories. Analysis reveals that Traj-CoA exhibits clinically aligned temporal reasoning, establishing it as a promising, robust, and generalizable approach for modeling complex patient trajectories.
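A minimal sketch of the chain-of-agents pattern the abstract describes is given below: sequential worker agents distill events into a shared memory, and a manager synthesizes the result. The llm() helper, the prompts, and the EHRMem interface are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the chain-of-agents pattern described above; the
# prompts and the EHRMem interface are illustrative assumptions.

def llm(prompt: str) -> str:
    """Placeholder: swap in any chat-completion call."""
    raise NotImplementedError("wire up a real model here")

def traj_coa(ehr_chunks: list[str]) -> str:
    ehr_mem: list[str] = []   # shared long-term memory module (EHRMem)
    summary = ""
    for chunk in ehr_chunks:  # worker agents process chunks sequentially
        out = llm(
            f"Running summary:\n{summary}\n\nEHR chunk:\n{chunk}\n\n"
            "Update the summary, then list critical dated events after 'EVENTS:'."
        )
        summary, _, events = out.partition("EVENTS:")
        ehr_mem.extend(line for line in events.splitlines() if line.strip())
    # manager agent synthesizes the workers' summary and the EHRMem timeline
    return llm(
        f"Summary:\n{summary}\n\nTimeline:\n" + "\n".join(ehr_mem) +
        "\n\nEstimate the one-year lung cancer risk and justify the estimate."
    )
```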
Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge
Masters, Charlie, Vellanki, Advaith, Shangguan, Jiangbo, Kultys, Bart, Gilmore, Jonathan, Moore, Alastair, Albrecht, Stefano V.
While agentic AI has advanced in automating individual tasks, managing complex multi-agent workflows remains a challenging problem. This paper presents a research vision for autonomous agentic systems that orchestrate collaboration within dynamic human-AI teams. We propose the Autonomous Manager Agent as a core challenge: an agent that decomposes complex goals into task graphs, allocates tasks to human and AI workers, monitors progress, adapts to changing conditions, and maintains transparent stakeholder communication. We formalize workflow management as a Partially Observable Stochastic Game and identify four foundational challenges: (1) compositional reasoning for hierarchical decomposition, (2) multi-objective optimization under shifting preferences, (3) coordination and planning in ad hoc teams, and (4) governance and compliance by design. To advance this agenda, we release MA-Gym, an open-source simulation and evaluation framework for multi-agent workflow orchestration. Evaluating GPT-5-based Manager Agents across 20 workflows, we find they struggle to jointly optimize for goal completion, constraint adherence, and workflow runtime - underscoring workflow management as a difficult open problem. We conclude with organizational and ethical implications of autonomous management systems.
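For reference, a partially observable stochastic game is standardly specified by a tuple of the following shape; the paper's exact formalization of workflow management may rename or extend these components:

```latex
% Standard POSG tuple (the paper's formalization may differ):
\left\langle \mathcal{I},\, \mathcal{S},\, \{\mathcal{A}_i\}_{i\in\mathcal{I}},\,
  T,\, \{\Omega_i\}_{i\in\mathcal{I}},\, O,\, \{R_i\}_{i\in\mathcal{I}} \right\rangle,
\qquad T(s' \mid s, \mathbf{a}), \qquad O(\mathbf{o} \mid s', \mathbf{a})
```

Here $\mathcal{I}$ is the set of agents (human and AI workers plus the manager), $\mathcal{S}$ the workflow states, $\mathcal{A}_i$ and $\Omega_i$ each agent's actions and observations, $T$ the transition kernel over joint actions, $O$ the joint observation function, and $R_i$ the per-agent rewards. Presumably the Manager Agent is one player of this game whose actions include decomposition, allocation, and replanning, and whose observations are the partially visible workflow state.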
When Does Divide and Conquer Work for Long Context LLM? A Noise Decomposition Framework
Xu, Zhen, Zhu, Shang, Wang, Jue, Wang, Junlin, Athiwaratkun, Ben, Wang, Chi, Zou, James, Zhang, Ce
We investigate the challenge of applying Large Language Models (LLMs) to long texts. We propose a theoretical framework that separates the failure modes of long-context tasks into three categories: cross-chunk dependence (task noise), confusion that grows with context size (model noise), and imperfect integration of partial results (aggregator noise). Under this view, we analyze when multi-agent chunking is effective, i.e., dividing a long sequence into smaller chunks and aggregating the processed results of each chunk. Our experiments on tasks such as retrieval, question answering, and summarization confirm both the theoretical analysis and the conditions that favor multi-agent chunking. By exploring superlinear growth of model noise with input length, we also explain why, for large inputs, a weaker model configured with chunk-based processing can surpass a more advanced model such as GPT-4o applied in a single shot. Overall, we present a principled framework for understanding these failures, and our results highlight a direct pathway to handling long contexts in LLMs with carefully managed chunking and aggregation strategies.
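To make the decomposition concrete, one can posit an additive error model of the following shape for an input of length $n$ split into $k$ chunks; the symbols and functional forms are illustrative assumptions, not the paper's exact expressions:

```latex
% Illustrative additive error model (assumed forms, not the paper's):
\mathrm{Err}(n,k) \;\approx\;
  \underbrace{\varepsilon_{\mathrm{task}}(k)}_{\text{cross-chunk dependence}}
  \;+\; \underbrace{k\,\varepsilon_{\mathrm{model}}(n/k)}_{\text{per-chunk model noise}}
  \;+\; \underbrace{\varepsilon_{\mathrm{agg}}(k)}_{\text{aggregator noise}}
```

If model noise grows superlinearly, say $\varepsilon_{\mathrm{model}}(m) = c\,m^{\alpha}$ with $\alpha > 1$, then $k\,\varepsilon_{\mathrm{model}}(n/k) = c\,n^{\alpha}k^{1-\alpha}$, which shrinks as $k$ grows; chunking pays off exactly when this reduction outweighs the task and aggregator noise it introduces, consistent with a weaker chunked model beating a stronger single-shot one on large inputs.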
SheetMind: An End-to-End LLM-Powered Multi-Agent Framework for Spreadsheet Automation
Zhu, Ruiyan, Cheng, Xi, Liu, Ke, Zhu, Brian, Jin, Daniel, Parihar, Neeraj, Xu, Zhoutian, Gao, Oliver
We present SheetMind, a modular multi-agent framework powered by large language models (LLMs) for spreadsheet automation via natural language instructions. The system comprises three specialized agents: a Manager Agent that decomposes complex user instructions into subtasks; an Action Agent that translates these into structured commands using a Backus-Naur Form (BNF) grammar; and a Reflection Agent that validates alignment between generated actions and the user's original intent. Integrated into Google Sheets via a Workspace extension, SheetMind supports real-time interaction without requiring scripting or formula knowledge. Experiments on benchmark datasets demonstrate an 80 percent success rate on single-step tasks and approximately 70 percent on multi-step instructions, outperforming ablated and baseline variants. Our results highlight the effectiveness of multi-agent decomposition and grammar-based execution for bridging natural language and spreadsheet functionality.
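The abstract does not reproduce the grammar, but a BNF-constrained action language for spreadsheets might look like the hypothetical fragment below, paired with a toy validator; the nonterminals and commands are invented for illustration and are not SheetMind's actual grammar.

```python
# Hypothetical BNF fragment for a spreadsheet action language, with a tiny
# regex-based validator; invented for illustration, not SheetMind's grammar.
import re

GRAMMAR = """
<action>  ::= <command> "(" <range> [ "," <value> ] ")"
<command> ::= "SET" | "SUM" | "SORT" | "DELETE"
<range>   ::= <cell> [ ":" <cell> ]
<cell>    ::= <column> <row>
"""

ACTION_RE = re.compile(
    r"^(SET|SUM|SORT|DELETE)\("   # <command>
    r"[A-Z]+\d+(:[A-Z]+\d+)?"     # <range>
    r"(,[^)]+)?\)$"               # optional <value>
)

def validate(action: str) -> bool:
    """Accept only strings derivable from the fragment above."""
    return ACTION_RE.match(action) is not None

assert validate("SET(A1,42)")
assert validate("SUM(B2:B10)")
assert not validate("DROP TABLE")
```

Constraining the Action Agent to such a grammar makes every emitted command machine-checkable before it touches the sheet, which is the point of grammar-based execution.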
CritiQ: Mining Data Quality Criteria from Human Preferences
Guo, Honglin, Lv, Kai, Guo, Qipeng, Liang, Tianyi, Xi, Zhiheng, Song, Demin, Zhang, Qiuyinzhe, Sun, Yu, Chen, Kai, Qiu, Xipeng, Gui, Tao
Language models depend heavily on high-quality data for optimal performance. Existing approaches rely on manually designed heuristics, the perplexity of existing models, training classifiers, or careful prompt engineering, which require significant expert experience and human annotation effort while introducing biases. We introduce CritiQ, a novel data selection method that automatically mines data-quality criteria from human preferences with only $\sim$30 human-annotated pairs and performs efficient data selection. The main component, CritiQ Flow, employs a manager agent to evolve quality criteria and worker agents to make pairwise judgments. We build a knowledge base that extracts quality criteria from previous work to boost CritiQ Flow. Compared to perplexity- and classifier-based methods, verbal criteria are more interpretable and reusable. After deriving the criteria, we train the CritiQ Scorer to assign quality scores and perform efficient data selection. We demonstrate the effectiveness of our method in the code, math, and logic domains, achieving high accuracy on human-annotated test sets. To validate the quality of the selected data, we continually train Llama 3.1 models and observe improved performance on downstream tasks compared to uniform sampling. Ablation studies validate the benefits of the knowledge base and the reflection process. We also analyze how criteria evolve and the effectiveness of majority voting.
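A heavily simplified sketch of the manager/worker loop the abstract describes follows; the prompts, the agreement signal, and the fixed round count are assumptions, not the released CritiQ code.

```python
# Simplified sketch of CritiQ Flow as described above: a manager agent
# evolves quality criteria while worker agents make pairwise judgments.
# Prompts and control flow are illustrative assumptions.

def llm(prompt: str) -> str:
    """Placeholder: swap in any chat-completion call."""
    raise NotImplementedError("wire up a real model here")

def critiq_flow(pairs, criteria: str, rounds: int = 5) -> str:
    """pairs: (text_a, text_b, label) triples, label in {'a', 'b'}."""
    for _ in range(rounds):
        # worker agents: judge each pair under the current criteria
        verdicts = [
            llm(f"Criteria:\n{criteria}\nA: {a}\nB: {b}\n"
                "Answer 'a' or 'b': which is higher quality?")
            for a, b, _ in pairs
        ]
        agreement = sum(
            v.strip().lower().startswith(label)
            for v, (_, _, label) in zip(verdicts, pairs)
        ) / len(pairs)
        # manager agent: revise the criteria toward the human preferences
        criteria = llm(
            f"Criteria:\n{criteria}\nAgreement with the ~30 human-annotated "
            f"pairs: {agreement:.0%}. Revise the criteria to match the humans."
        )
    return criteria
```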
Simulating Classroom Education with LLM-Empowered Agents
Zhang, Zheyuan, Zhang-Li, Daniel, Yu, Jifan, Gong, Linlu, Zhou, Jinchang, Liu, Zhiyuan, Hou, Lei, Li, Juanzi
Large language models (LLMs) have been employed in various intelligent educational tasks to assist teaching. While preliminary explorations have focused on independent LLM-empowered agents for specific educational tasks, the potential of LLMs within a multi-agent collaborative framework to simulate a classroom with real user participation remains unexplored. In this work, we propose SimClass, a multi-agent classroom simulation framework involving user participation. We recognize representative class roles, introduce a novel class-control mechanism for automatic classroom teaching, and conduct user experiments in two real-world courses. Using the Flanders Interaction Analysis System and Community of Inquiry theoretical frameworks from educational analysis, we demonstrate that LLMs can simulate traditional classroom interaction patterns effectively while enhancing the user experience. We also observe emergent group behaviors among agents in SimClass, where agents collaborate to create enlivening classroom interactions that improve the user learning process. We hope this work pioneers the application of LLM-empowered multi-agent systems in virtual classroom teaching.
MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL
Askari, Arian, Poelitz, Christian, Tang, Xinye
Self-correction in text-to-SQL is the process of prompting a large language model (LLM) to revise its previously incorrect SQL, and it commonly relies on self-correction guidelines manually crafted by human experts, which are not only labor-intensive to produce but also limited by humans' ability to identify all potential error patterns in LLM responses. We introduce MAGIC, a novel multi-agent method that automates the creation of the self-correction guideline. MAGIC uses three specialized agents: a manager, a correction agent, and a feedback agent. These agents collaborate on the failures of an LLM-based method on the training set to iteratively generate and refine a self-correction guideline tailored to LLM mistakes, mirroring human processes but without human involvement. Our extensive experiments show that MAGIC's guideline outperforms ones created by human experts. We empirically find that the guideline produced by MAGIC enhances the interpretability of the corrections made, providing insight into the reasons behind the failures and successes of LLMs in self-correction. We make all agent interactions publicly available to the research community to foster further research in this area, offering a synthetic dataset for future exploration of automatic self-correction guideline generation.
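The three-agent loop the abstract describes could be sketched roughly as below; the agent prompts, the failure tuple, and the iteration scheme are invented for illustration and do not reflect the released interactions.

```python
# Rough sketch of the manager / correction / feedback loop described above;
# prompts and control flow are invented for illustration.

def llm(prompt: str) -> str:
    """Placeholder: swap in any chat-completion call."""
    raise NotImplementedError("wire up a real model here")

def magic(failures, iterations: int = 3) -> str:
    """failures: (question, schema, wrong_sql, error) tuples from the training set."""
    guideline = "Check the schema and column names before revising the SQL."
    for _ in range(iterations):
        for question, schema, wrong_sql, error in failures:
            # manager agent: frame the correction task for this failure
            task = llm(f"Guideline:\n{guideline}\nQuestion: {question}\n"
                       f"Error: {error}\nDescribe what the correction agent should try.")
            # correction agent: revise the SQL under the current guideline
            fixed_sql = llm(f"{task}\nSchema: {schema}\n"
                            f"Wrong SQL: {wrong_sql}\nReturn corrected SQL only.")
            # feedback agent: fold the outcome back into the guideline
            guideline = llm(f"Guideline:\n{guideline}\nAttempted fix: {fixed_sql}\n"
                            "Refine the guideline based on this attempt.")
    return guideline
```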